Robust speech recognition based on noise and SNR classification - a multiple-model framework

Authors

  • Haitian Xu
  • Zheng-Hua Tan
  • Paul Dalsgaard
  • Børge Lindberg
Abstract

This paper presents a multiple-model framework for noise-robust speech recognition. In this framework, multiple HMM model sets are trained, each identified by a noise type and a specific signal-to-noise ratio (SNR) value. This does not increase the computational complexity of recognition, however, because only one model set is selected, according to the noise classification and SNR estimation. The optimal number of model sets is first identified on the basis of the Aurora 2 database. With only three model sets for each noise type, the framework outperforms Multi-style TRaining (MTR) when tested on known noise types but performs worse on unknown noise types. To overcome this drawback, a modified Jacobian method is proposed to adapt the selected HMM models to the test environment. Furthermore, because MTR often gives relatively stable performance on unknown noise types, a combined technique is applied in which the MTR and adapted models are interpolated. This combined technique gives more than 24% performance improvement over MTR.
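The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `ModelSetKey`, `INVENTORY`, and `select_model_set`, the three SNR levels, and the nearest-SNR rule are all assumptions made for illustration. The abstract states only that one model set is chosen from the noise classification and the SNR estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSetKey:
    """Identifies one HMM model set by noise type and training SNR (hypothetical)."""
    noise_type: str
    snr_db: float


# Assumed inventory: three SNR levels per noise type. The abstract reports
# three model sets per noise type but does not give the SNR values used.
ASSUMED_SNRS_DB = (0.0, 10.0, 20.0)
ASSUMED_NOISE_TYPES = ("subway", "babble", "car", "exhibition")
INVENTORY = [
    ModelSetKey(noise, snr)
    for noise in ASSUMED_NOISE_TYPES
    for snr in ASSUMED_SNRS_DB
]


def select_model_set(noise_type, estimated_snr_db, inventory=INVENTORY):
    """Pick the single model set whose noise type matches the classifier
    output and whose training SNR is closest to the estimated SNR.

    Nearest-SNR matching is an assumption of this sketch.
    """
    candidates = [key for key in inventory if key.noise_type == noise_type]
    if not candidates:
        # Unknown noise type: the paper handles this case with adaptation
        # and interpolation with MTR, which this sketch does not model.
        raise KeyError(f"no model set trained for noise type {noise_type!r}")
    return min(candidates, key=lambda key: abs(key.snr_db - estimated_snr_db))


if __name__ == "__main__":
    # Only the returned model set would be used for decoding.
    print(select_model_set("car", 12.3))
```

Because only the returned model set is passed to the decoder, recognition cost stays that of a single model set regardless of how many sets were trained, which is the point the abstract makes about computational complexity.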


Similar resources

A new method for robust speech recognition based on missing data using a bidirectional neural network

The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the time-frequency representation of the signal (spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...


Speech enhancement based on hidden Markov model using sparse code shrinkage

This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian combination (for clean speech and noise, respectively) in the HMM ...


Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...


Noise spectrum estimation using Gaussian mixture model-based speech presence probability for robust speech recognition

This work presents a noise spectrum estimator based on the Gaussian mixture model (GMM)-based speech presence probability (SPP) for robust speech recognition. The estimated noise spectrum is then used to compute a subband a posteriori signal-to-noise ratio (SNR). A sigmoid-shaped weighting rule is formed based on this subband a posteriori SNR to enhance the speech spectrum in the auditory domain, wh...


Speech recognition system robust to noise and speaking styles

It is difficult to recognize speech distorted by various factors, especially when an ASR system contains only a single acoustic model. One solution is to use multiple acoustic models, one model for each different condition. In this paper, we discuss a parallel decoding-based ASR system that is robust to the noise type, SNR, speaker gender and speaking style. Our system consists of two recogniti...



Publication date: 2005